Abstract — Global warming is an important issue all over the
world and has several effects on the environment. Several
factors are responsible for global warming; one of the main
ones is the release of carbon dioxide. This paper focuses on
the variables that capture the aftermath of global warming on
the environment. Classification and prediction techniques are
used to classify the factors of global warming and then to
predict their levels in the atmosphere in future years, and
thereby their effect on the environment.

Index Terms — Classification algorithms, Data mining, Global
warming, Prediction algorithms.


I. INTRODUCTION 

Data mining has attracted a lot of attention in the research
industry and in society as a whole in recent years, due to the
wide availability of large amounts of data and the need to
turn such data into useful information and knowledge. The
objective of this paper is to analyze such data in order to
address environmental research issues.

Global warming is an issue that has come up repeatedly in
recent years with the increase in temperature and carbon
dioxide levels. Scientists believe that its main causes are
deforestation, pollution, and carbon emissions from
transportation and factories. Global warming and climate
change are terms for the observed century-scale rise in the
average temperature of the Earth's climate system and its
related effects.

Factors of global warming 
Greenhouse gases 
Variations in earth's orbit 
Deforestation 
Burning fossil fuels 

Prediction has long been a leading technique for emulating the
pattern of global warming. There are several factors of global
warming, but only the most influential ones are considered in
this paper. Data sets on these factors have been formulated in
such a way that the impact of each one can be aggregated to
predict the effects of global warming in the future.
Algorithms such as regression (linear regression, multiple
linear regression, and non-linear regression), classification,
and density estimation have been used for prediction. Using
these algorithms, comparisons will be made to summarise the
aftermath effect of these factors on the environment.
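
To make the regression-based prediction concrete, the following is a minimal sketch of a simple linear regression fitted by ordinary least squares and extrapolated to a future year. It is an illustration under stated assumptions, not the implementation used in this work: the function names and the `emission_index` values are hypothetical and synthetic.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(year: float, slope: float, intercept: float) -> float:
    """Evaluate the fitted line at a given year."""
    return slope * year + intercept


# Synthetic, illustrative values only (not measurements).
years = [2001, 2002, 2003, 2004, 2005]
emission_index = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(years, emission_index)
forecast_2012 = predict(2012, slope, intercept)
print(f"slope={slope:.2f}, forecast for 2012={forecast_2012:.1f}")
```

Multiple linear regression follows the same idea with more than one input variable per year; a straight-line extrapolation of this kind is only as reliable as the assumption that the past trend continues.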

II. Literature Review

Data mining, also called Knowledge Discovery in Databases
(KDD), is the field of discovering novel and potentially useful
information from large amounts of data. The idea behind this
paper comes from educational data mining, which is still in its
infancy. For the case of global warming, we studied several
papers, whose findings are summarised below:

P. Kaur, M. Singh, G S Josan applied the CHAID prediction
model to analyze the interrelation between the variables used
to predict slow learners in school education. The CHAID
prediction model of student performance was constructed with
seven class predictor variables. [13]

K Kaku approximates the baseline of GHG emissions and their
reduction in the poultry and swine industries of the ASEAN 8
countries when a GHG reduction scenario is adopted, namely a
waste management system in place of the conventional system;
examines the fluctuation of the current benchmark price of GHG
and shows that a stable economic benefit cannot be expected;
and shows the economic benefits that the broiler and swine
industries in the ASEAN 8 countries, as developing countries,
could expect. [12]

T-S Kwon, C M Lee, S-S Kim describe the prediction of the
abundance of beetles. In this study, a simple change in
temperature affects the abundance of beetles; they applied
quantitative prediction of abundance on the basis of
temperature change, using statistical analysis on the data
set. [18]

T-S Kwon, C M Lee, J Park, S-S Kim, J H Chun, J H Sung
describe the prediction of the abundance of ants. The study
included a simple change in temperature and did not consider
competition between species. The temperature range estimated
with existing statistical methods differs from the result
obtained in this study. [19]

T-S Kwon, C M Lee, J Park, S-S Kim, J H Chun, J H Sung
describe the prediction of the abundance of spiders. They
applied quantitative prediction of abundance on the basis of
temperature change and sorted more than one species of spider
into three categories: increase, no change, and decrease. [17]
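
The three-way labelling described above can be sketched as a threshold rule on the relative change in predicted abundance. This is only an illustrative reading: the function name `change_category`, the 5% tolerance, and the sample values are assumptions, not details taken from the cited study.

```python
def change_category(current: float, predicted: float, tolerance: float = 0.05) -> str:
    """Label a predicted abundance as 'increase', 'no change', or 'decrease'.

    The tolerance (here 5%) is a hypothetical band inside which the
    change is treated as negligible.
    """
    if current <= 0:
        raise ValueError("current abundance must be positive")
    relative_change = (predicted - current) / current
    if relative_change > tolerance:
        return "increase"
    if relative_change < -tolerance:
        return "decrease"
    return "no change"


# Synthetic example: one current abundance, three predicted abundances.
categories = [change_category(100.0, p) for p in (130.0, 102.0, 60.0)]
print(categories)
```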

P C Austin, E W Steyerberg provide a method to determine the
number of independent variables that can be included in a
linear regression model, focusing on accurate estimation of
regression coefficients, standard errors, and confidence
intervals. They report that linear regression models require
only two subjects per variable (SPV) for adequate estimation
of regression coefficients, standard errors, and confidence
intervals. [14]

H Wang, X Lua, P Xua, D Yuan introduce the concept of
CDHs/HDHs (cooling/heating degree hours) and propose weekly
prediction models of total building power consumption based on
a multiple linear regression algorithm, which is relatively
simple and easy to understand. The prediction models are
validated in that paper as having high accuracy and general
applicability, offering reliable guidance to building facility
managers and the relevant authorities for decision making and
policy implementation. [4]
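
As a rough illustration of the degree-hour idea, cooling and heating degree hours are commonly computed by summing, hour by hour, how far the outdoor temperature lies above or below a base temperature. The sketch below assumes a single base of 18 °C and synthetic readings; the base temperature used in the cited paper is not stated here, so both are assumptions.

```python
from typing import Iterable


def cooling_degree_hours(hourly_temps_c: Iterable[float], base_c: float = 18.0) -> float:
    """Sum of hourly excesses of temperature above the base."""
    return sum(max(t - base_c, 0.0) for t in hourly_temps_c)


def heating_degree_hours(hourly_temps_c: Iterable[float], base_c: float = 18.0) -> float:
    """Sum of hourly shortfalls of temperature below the base."""
    return sum(max(base_c - t, 0.0) for t in hourly_temps_c)


# Synthetic hourly readings in degrees Celsius (illustrative only).
temps = [15.0, 17.0, 18.0, 21.0, 24.0]
cdh = cooling_degree_hours(temps)
hdh = heating_degree_hours(temps)
print(cdh, hdh)
```

Totals of this kind can then serve as input variables to a multiple linear regression model.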

A M Freije, T Hussain, E A Salman provide information on, and
seek to increase awareness of, three aspects of global
warming: causes, impacts, and solutions. The study recommends
integrating environmental concepts into the university
curriculum for all students, irrespective of their academic
specialization, in order to increase environmental
awareness. [1]

Reviewing these papers, one conclusion is that global warming
needs to be taken seriously and that methods or techniques
have to be developed to understand its pattern better. This
paper discusses the factors that most affect the environment
in a hazardous way. Predicting the patterns from the data set
with prediction algorithms will give an indication of how
alarming global warming is. Most prediction techniques take
into account only the temperature rise, but this paper looks
beyond that. The data set has to be categorised so that
separate results are obtained for separate factors of global
warming, which indicates what to reduce and what to take care
of, including what should no longer be used. This paper
focuses on an agenda that will give results and a chance to
save nature and the environment from extinction.

III. Methodology 
A. Proposed methodology 

A survey-cum-experimental methodology is used. Through an
extensive search of the literature and discussion with experts
on the effects of global warming, a number of factors
considered to influence those effects are identified. These
influencing factors are categorized as input variables. For
this work, recent real-world data is collected online (World
Development bank). This data is then filtered using manual
techniques and transformed into a standard format. After that,
features and parameters are selected. The identified
parameters are then analyzed and implemented in the tool.
After implementation, results are produced and analyzed. A
stepwise description of the methodology is given as a
flowchart in Fig 1.



Fig 1. Flowchart of proposed work 
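
The filtering, transformation, and feature-selection steps of the flowchart can be sketched as three small functions. This is a minimal sketch under stated assumptions, not the tool used in this work: the column names (`co2_index`, `forest_loss_index`), the choice of min-max scaling as the "standard format", and all values are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def filter_records(records: List[Record]) -> List[Record]:
    """Filtering step: drop rows that contain any missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def min_max_scale(records: List[Record], column: str) -> List[Record]:
    """Transformation step: rescale one column to the range [0, 1]."""
    values = [r[column] for r in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**r, column: (r[column] - lo) / span if span else 0.0}
        for r in records
    ]


def select_features(records: List[Record], columns: List[str]) -> List[Record]:
    """Feature-selection step: keep only the chosen columns."""
    return [{c: r[c] for c in columns} for r in records]


# Synthetic rows (illustrative only); one row has a missing value.
raw = [
    {"year": 2001.0, "co2_index": 10.0, "forest_loss_index": 5.0},
    {"year": 2002.0, "co2_index": None, "forest_loss_index": 6.0},
    {"year": 2003.0, "co2_index": 14.0, "forest_loss_index": 7.0},
    {"year": 2004.0, "co2_index": 18.0, "forest_loss_index": 9.0},
]

clean = filter_records(raw)
scaled = min_max_scale(clean, "co2_index")
features = select_features(scaled, ["year", "co2_index"])
print(features)
```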


IV. Experimentation 

A. Database 

A numerical database is used in this experimental setup. The
data was collected from various websites and converted into a
relational database schema.


Factors                  Years       Variables
-----------------------  ----------  ---------------------------
Greenhouse gases (CO2)   2001-2011   Domestic transport,
                                     End user level, Industries,
                                     Household waste,
                                     Burning fossil, Road, Rail,
                                     Taxi, Chemical
Deforestation            2001-2011   Not plantation, Whether,
                                     Population,
                                     Gross forest loss,
                                     U N forest loss

Tab 1. Dataset on factors of global warming 
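
One possible relational layout for the data in Tab 1 is sketched below using an in-memory SQLite database. The table and column names (`factor`, `observation`, `variable`, `value`) and the inserted values are hypothetical placeholders; the schema actually used in this work is not specified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE factor (
        factor_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE observation (
        factor_id INTEGER NOT NULL REFERENCES factor(factor_id),
        variable  TEXT    NOT NULL,
        year      INTEGER NOT NULL CHECK (year BETWEEN 2001 AND 2011),
        value     REAL    NOT NULL,
        PRIMARY KEY (factor_id, variable, year)
    );
    """
)

conn.execute("INSERT INTO factor (factor_id, name) VALUES (1, 'Greenhouse gases (CO2)')")
conn.execute("INSERT INTO factor (factor_id, name) VALUES (2, 'Deforestation')")

# Placeholder values, for illustration only (not real measurements).
rows = [
    (1, "Road", 2001, 1.0),
    (1, "Road", 2002, 1.5),
    (2, "Gross forest loss", 2001, 3.0),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)

# Count the stored observations per factor.
per_factor = conn.execute(
    "SELECT f.name, COUNT(*) FROM observation o "
    "JOIN factor f ON f.factor_id = o.factor_id "
    "GROUP BY f.name ORDER BY f.name"
).fetchall()
print(per_factor)
```

Keeping one row per factor, variable, and year makes it straightforward to query a single variable as a time series for the prediction step.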


B. Algorithms 

In the survey, many algorithms are used for prediction, which
helps to identify the most influential factors affecting the
environment. An algorithm in data mining is a set of
heuristics and calculations that creates a model from data.
The mining model that an algorithm creates from data can take
various forms, including classification, regression,
prediction, density estimation, and association rules.

> Classification algorithms predict one or more discrete
  variables based on the other attributes in the dataset.

> Regression algorithms predict one or more continuous
  numeric variables, such as profit or loss, based on
  the other attributes in the dataset.

> Segmentation algorithms divide data into groups, or
  clusters, of items that have similar properties.

> Association algorithms find correlations between
  different attributes in a dataset. The most common
  application of this kind of algorithm is creating
  association rules, which can be used in a market
  basket analysis.

> Sequence analysis algorithms summarize frequent
  sequences or episodes in data, such as a series of
  clicks in a web site, or a series of log events
  preceding machine maintenance.

One of the above-mentioned algorithms will be used for
prediction.
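
As an illustration of the classification family listed above, the following is a minimal k-nearest-neighbours sketch that assigns a discrete label to a new observation from labelled examples. The choice of k-nearest neighbours, the two normalised input attributes, and the "low"/"high" impact labels are assumptions made for the example; the paper does not fix a particular algorithm.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str]


def knn_classify(train: List[Example], query: Sequence[float], k: int = 3) -> str:
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Synthetic training data: (normalised attribute 1, normalised attribute 2) -> label.
train = [
    ((0.1, 0.2), "low"),
    ((0.2, 0.1), "low"),
    ((0.3, 0.3), "low"),
    ((0.8, 0.9), "high"),
    ((0.9, 0.8), "high"),
    ((0.7, 0.7), "high"),
]

label_a = knn_classify(train, (0.15, 0.15))
label_b = knn_classify(train, (0.85, 0.85))
print(label_a, label_b)
```

A regression algorithm would instead return a continuous value for the same inputs, which is the distinction drawn in the list above.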


V. Conclusion and Future work 

In this paper, classification techniques are used for
prediction on the global warming dataset, to predict and
analyze the factors affecting the environment as well as the
most hazardous factors among them. This research indicates
what to reduce and what to take care of, including what should
no longer be used. The paper focuses on an agenda that will
give results and a chance to save nature and the wildlife
environment from extinction. It discusses the factors that
most affect the environment in a hazardous way. Predicting the
patterns from the data set with classification algorithms will
give an indication of how alarming global warming is, which
further provides a basis for deciding on special aid. In
future work, data mining techniques will be integrated with
DBMS and machine learning techniques and applied to different
datasets to assess the accuracy of predictions. Also, new
factors can be applied to improve lives, learning, and
retention capabilities among people. Hence global warming is a
promising area for further research, and the approach can be
applied in other areas such as medicine, sports, education,
and the share market, owing to the availability of huge
databases.